A Multimodal Real-Time MRI Articulatory Corpus for Speech Research
Abstract
We present MRI-TIMIT, a large-scale database of synchronized audio and real-time magnetic resonance imaging (rtMRI) data for speech research. The database currently consists of speech data acquired from two male and two female speakers of American English. Subjects’ upper airways were imaged in the midsagittal plane while they read the same 460-sentence set used in the MOCHA-TIMIT corpus [1]. The accompanying acoustic recordings were phonemically transcribed using forced alignment. Vocal tract tissue boundaries were automatically identified in each video frame, allowing dynamic quantification of each speaker’s midsagittal articulation. The database and companion toolset provide a unique resource for examining articulatory-acoustic relationships in speech production.
Similar resources
Validating rt-MRI Based Articulatory Representations via Articulatory Recognition
The recently acquired large corpus of real-time magnetic resonance image sequences of the vocal tract during speech production, referred to as MRI-TIMIT, provides a unique platform for systematically studying articulatory dynamics. Compared to previously collected articulatory datasets, e.g., using articulography or X-rays, MRI-TIMIT is a rich source of information fo...
Automatic Data-Driven Learning of Articulatory Primitives from Real-Time MRI Data Using Convolutive NMF with Sparseness Constraints
We present a procedure to automatically derive interpretable dynamic articulatory primitives in a data-driven manner from image sequences acquired through real-time magnetic resonance imaging (rt-MRI). More specifically, we propose a convolutive Nonnegative Matrix Factorization algorithm with sparseness constraints (cNMFsc) to decompose a given set of image sequences into a set of basis image s...
USC-TIMIT: A database of multimodal speech production data
USC-TIMIT is a speech production database under ongoing development, which currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English, and electromagnetic articulography data from five of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460-sentence corpus. In both case...
Multimodal Fusion of Electromagnetic, Ultrasound and MRI Data for Building an Articulatory Model
Data fusion from multiple sensors is of significant interest to the speech research community, as it can potentially provide a better picture of speech production through the use of complementary sensor modalities. This paper deals with the practical aspects of this problem, such as acquisition and processing of the dynamic ultrasound (US) and electromagnetic (EM) data of the tongue during spee...
Characterizing Articulation in Apraxic Speech Using Real-Time Magnetic Resonance Imaging.
Purpose: Real-time magnetic resonance imaging (MRI) and accompanying analytical methods are shown to capture and quantify salient aspects of apraxic speech, substantiating and expanding upon evidence provided by clinical observation and acoustic and kinematic data. Analysis of apraxic speech errors within a dynamic systems framework is provided and the nature of pathomechanisms of apraxic speech...